In [1]:
from IPython.display import display, Image, HTML
from talktools import website, nbviewer
IPython is an open source, interactive computing environment for Python and other languages.
In [2]:
website('http://ipython.org')
Out[2]:
IPython.parallel: interactive parallel computing
See the following talk on IPython.parallel by Min Ragan-Kelley
In [3]:
import ipythonproject
In [4]:
ipythonproject.core_devs()
The IPython Notebook is a web-based interactive computing environment that spans the full range of computing related activities:
How does IPython target these different activities?
The central focus of IPython is the writing and running of code. We try to make this as pleasant as possible:
Let's use NumPy and Matplotlib to look at the eigenvalue spacing distribution of random matrices:
In [5]:
%matplotlib inline
In [6]:
import matplotlib.pyplot as plt
import seaborn
import numpy as np
ra = np.random
la = np.linalg
In [7]:
def GOE(N):
    """Creates an NxN element of the Gaussian Orthogonal Ensemble"""
    m = ra.standard_normal((N,N))
    m += m.T  # symmetrize
    return m/2

def center_eigenvalue_diff(mat):
    """Compute the eigvals of mat and then find the center eigval difference."""
    N = len(mat)
    evals = np.sort(la.eigvals(mat))
    # Integer division: N/2 is a float under Python 3 and cannot be used as an index
    diff = np.abs(evals[N//2] - evals[N//2-1])
    return diff

def ensemble_diffs(num, N):
    """Return num eigenvalue diffs for the NxN GOE ensemble."""
    diffs = np.empty(num)
    for i in range(num):
        mat = GOE(N)
        diffs[i] = center_eigenvalue_diff(mat)
    # Normalize so that the mean spacing is 1
    return diffs/diffs.mean()
In [8]:
diffs = ensemble_diffs(1000,30)
In [9]:
plt.hist(diffs, bins=30, normed=True)
plt.xlabel('Normalized eigenvalue spacing s')
plt.ylabel('Probability $P(s)$')
Out[9]:
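For reference, a histogram like this is usually compared against the Wigner surmise for the GOE, $P(s) = \frac{\pi}{2}\, s\, e^{-\pi s^2/4}$, which has unit mean and second moment $4/\pi \approx 1.27$. The sketch below is not part of the original talk: it recomputes the normalized spacings with a vectorized `np.linalg.eigvalsh` call and compares the sample second moment with that value. The function names, the seed, and the sample size are assumptions made for illustration. The surmise is exact only for 2x2 matrices, so rough agreement is all one should expect.

```python
import numpy as np

def wigner_surmise(s):
    """Wigner surmise for GOE level spacings (unit mean spacing)."""
    return (np.pi / 2.0) * s * np.exp(-np.pi * s**2 / 4.0)

def goe_center_spacings(num, N, seed=0):
    """Normalized center eigenvalue spacings for `num` NxN GOE samples."""
    rng = np.random.default_rng(seed)
    m = rng.standard_normal((num, N, N))
    m = (m + m.transpose(0, 2, 1)) / 2.0   # symmetrize each matrix in the stack
    evals = np.linalg.eigvalsh(m)          # real eigenvalues, ascending per matrix
    diffs = evals[:, N // 2] - evals[:, N // 2 - 1]
    return diffs / diffs.mean()

s = goe_center_spacings(2000, 30)
print("sample  <s^2>:", np.mean(s**2))
print("surmise <s^2>:", 4.0 / np.pi)
```

Overlaying `wigner_surmise` on the histogram with `plt.plot` gives the same comparison visually.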
Common shell commands (ls, cd) just work:
In [10]:
ls
Manage small files in the notebook using the %%writefile magic command:
In [11]:
%%writefile data/mydata.csv
0 1 2 3 4 5 6 7 8 9 10
Any command prefixed with ! is run in the system shell:
In [12]:
!cat data/mydata.csv
What does this have to do with parallel computing?
The canonical user interface to clusters and supercomputers is a terminal session over SSH. Ouch. This is extremely painful for the user and makes it almost impossible to reproduce the workflow. Here is a simple recipe for making parallel computing reproducible and literate:
Scientific computing is a multi-language activity that draws on Python, C, C++, Fortran, Perl, Bash, and more. The IPython architecture is language agnostic.
For statistical computing, R is a great option. Let's fit a linear model in R and visualize the results:
In [13]:
import numpy as np
X = np.array([0,1,2,3,4])
Y = np.array([3,5,4,6,7])
%load_ext rmagic
The %%R syntax tells IPython to run the rest of the cell as R code:
In [14]:
%%R -i X,Y -o XYcoef
XYlm = lm(Y~X)
XYcoef = coef(XYlm)
print(summary(XYlm))
par(mfrow=c(2,2))
plot(XYlm)
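For a single predictor, `lm(Y~X)` fits the ordinary least-squares line, and `coef` returns the intercept followed by the slope. As a cross-check that does not require R, the closed-form solution can be computed directly in Python. This is only a sketch, and the helper name `ols_line` is introduced here for illustration.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope is covariance(x, y) / variance(x); the 1/n factors cancel
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

X = [0, 1, 2, 3, 4]
Y = [3, 5, 4, 6, 7]
intercept, slope = ols_line(X, Y)
print(intercept, slope)  # approximately 3.2 and 0.9
```

These two numbers should match the values that come back from R in `XYcoef`.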
This %%language syntax is an IPython-specific extension to the Python language. This "magic command syntax" allows Python code to call out to a wide range of other languages (Ruby, Bash, Julia, Fortran, Perl, Octave, Matlab, etc.).
In [15]:
%%ruby
puts "Hello from Ruby #{RUBY_VERSION}"
In [16]:
%%bash
echo "hello from $BASH"
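Conceptually, a script magic such as `%%bash` hands the cell body to an external interpreter and shows whatever that process prints. A rough plain-Python equivalent is sketched below. It assumes `bash` is available on the PATH, the helper name `run_cell_in` is hypothetical, and it is not IPython's actual implementation.

```python
import subprocess

def run_cell_in(interpreter, cell_source):
    """Feed `cell_source` to `interpreter` on stdin and return its stdout."""
    result = subprocess.run(
        [interpreter],
        input=cell_source,
        capture_output=True,
        text=True,
        check=True,   # raise if the interpreter exits with an error
    )
    return result.stdout

output = run_cell_in("bash", 'echo "hello from $BASH"\n')
print(output)
```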
In the IPython architecture, the kernel is a separate process that runs the user's code and returns the output to the frontend (Notebook, Terminal, etc.). Kernels talk to frontends using a well-documented message protocol (JSON over ZeroMQ and WebSockets). The default kernel that ships with IPython knows how to run Python code. However, there are now kernels in other languages:
Later this year, all users of the IPython Notebook will have the option to choose what type of kernel to use for each Notebook.
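To make the message protocol a little more concrete, here is a simplified sketch of an `execute_request`, the message a frontend sends to ask a kernel to run code. The field layout follows the published messaging specification, but the real wire format also carries a signature and ZeroMQ routing frames that are omitted here. The `make_execute_request` helper, the user name, and the version string are placeholders for illustration.

```python
import json
import uuid

def make_execute_request(code, session_id):
    """Build a simplified execute_request message as a plain dict."""
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),
            "session": session_id,
            "username": "demo",            # placeholder user name
            "msg_type": "execute_request",
            "version": "5.0",              # assumed protocol version
        },
        "parent_header": {},               # empty: this message starts a conversation
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
        },
    }

msg = make_execute_request("print('hello')", session_id=str(uuid.uuid4()))
wire = json.dumps(msg)   # JSON text that would be serialized before sending
print(json.loads(wire)["header"]["msg_type"])
```

Because the payload is plain JSON, a kernel written in any language can parse the request and answer with messages of the same shape.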
Here is a notebook that runs code in the native Julia kernel:
In [17]:
website("http://nbviewer.ipython.org/url/jdj.mit.edu/~stevenj/IJulia%20Preview.ipynb")
Out[17]:
Notebook documents are just JSON files stored on your filesystem. These files store everything related to a computation:
Notebook documents can be shared:
Notebook documents can be viewed by anyone on the web through http://nbviewer.ipython.org
In [18]:
website("http://nbviewer.ipython.org")
Out[18]:
This allows people to compose and share reproducible stories that involve code and data.
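Because a notebook document is plain JSON, it can be inspected with nothing more than the standard library. The sketch below builds a tiny in-memory document and pulls out its code cells. It assumes the version 4 notebook layout with a top-level `cells` list; older files nest their cells inside `worksheets`, so treat the keys as an assumption to check against your own files.

```python
import json

# A minimal notebook document, serialized the way it would sit on disk
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# A title"]},
        {"cell_type": "code", "execution_count": 1, "metadata": {},
         "outputs": [], "source": ["x = 1\n", "print(x)"]},
    ],
})

def code_cells(nb_json):
    """Return the source of every code cell, one string per cell."""
    nb = json.loads(nb_json)
    sources = []
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            src = cell["source"]
            # "source" may be a single string or a list of lines
            sources.append(src if isinstance(src, str) else "".join(src))
    return sources

print(code_cells(notebook_text))
```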
Earlier this year, Randall Munroe (xkcd) published a comic about regular expression golf. Peter Norvig from Google wanted to explore some of the algorithms related to this comic and shared his explorations as a notebook on nbviewer:
In [20]:
website("http://nbviewer.ipython.org/url/norvig.com/ipython/xkcd1313.ipynb")
Out[20]:
IPython has a display system for rich output formats. This rich display system allows Python objects to declare non-textual representations that can be displayed in the Notebook. These rich representations include:
These rich representations are displayed using IPython's display function:
In [21]:
from IPython.display import HTML, Image, YouTubeVideo, Audio, Latex
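An object opts into rich display by defining methods such as `_repr_html_`, which IPython calls when it needs an HTML representation. The `Temperature` class below is a made-up example. Outside the Notebook, `_repr_html_` is just an ordinary method that returns a string, which is what the last lines demonstrate.

```python
class Temperature:
    """Toy object with both a plain-text and an HTML representation."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        return "Temperature(%.1f C)" % self.celsius

    def _repr_html_(self):
        # Looked up by IPython's display machinery when HTML output is possible
        color = "red" if self.celsius > 30 else "blue"
        return '<b style="color:%s">%.1f &deg;C</b>' % (color, self.celsius)

t = Temperature(35.0)
print(repr(t))
print(t._repr_html_())
```

In a Notebook, calling `display(t)` or leaving `t` as the last expression in a cell would render the HTML version instead of the plain-text one.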
Here is an Image object whose representation is an image:
In [22]:
i = Image('images/ipython_logo.png')
In [23]:
display(i)
The Audio object has a representation that is an HTML5 audio player:
In [24]:
a = Audio('data/Bach Cello Suite #3.wav')
In [25]:
display(a)
The Latex object produces a representation that is rendered LaTeX. In this case, Maxwell's equations:
In [26]:
Latex(r"""\begin{eqnarray}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0
\end{eqnarray}""")
Out[26]:
The YouTubeVideo object embeds the HTML for a YouTube video in the notebook:
In [27]:
YouTubeVideo('sjfsUzECqK0')
Out[27]:
Data exploration is an iterative process that involves repeated passes at visualization, interaction and computation:
In [28]:
Image('images/VizInteractCompute.png')
Out[28]:
Right now this cycle is still really painful:
For IPython 2.0 we have built an architecture that allows Python and JavaScript to communicate seamlessly and in real time. This allows users to easily iterate through this cycle.
In this example, we will perform some basic image processing using scikit-image.
In [29]:
from IPython.html.widgets import *
In [30]:
import skimage
from skimage import data, filter, io
In [33]:
i = data.coffee()
io.Image(i)
Out[33]:
In [34]:
def edit_image(image, sigma=0.1, r=1.0, g=1.0, b=1.0):
    new_image = filter.gaussian_filter(image, sigma=sigma, multichannel=True)
    new_image[:,:,0] = r*new_image[:,:,0]
    new_image[:,:,1] = g*new_image[:,:,1]
    new_image[:,:,2] = b*new_image[:,:,2]
    new_image = io.Image(new_image)
    display(new_image)
    return new_image
Calling the function once displays and returns the edited image:
In [35]:
new_i = edit_image(i, 0.5, r=0.5);
In [36]:
lims = (0.0,1.0,0.01)
interact(edit_image, image=fixed(i), sigma=(0.0,10.0,0.1), r=lims, g=lims, b=lims);
Let's explore the Lorenz system of differential equations:
$$
\begin{aligned}
\dot{x} & = \sigma(y-x) \\
\dot{y} & = \rho x - y - xz \\
\dot{z} & = -\beta z + xy
\end{aligned}
$$

This is one of the classic systems in non-linear differential equations. It exhibits a range of different behaviors as the parameters ($\sigma$, $\beta$, $\rho$) are varied.
In [37]:
from IPython.html.widgets import interact, fixed
from IPython.display import clear_output, display, HTML
Here is a Python function that solves the Lorenz system using SciPy and plots the results using matplotlib:
In [38]:
from lorenz import solve_lorenz
In [39]:
t, x_t = solve_lorenz(N=10, angle=0.0, max_time=4.0, sigma=10.0, beta=8./3, rho=28.0)
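The `lorenz` module itself is not shown here. As a stand-in, the sketch below integrates the Lorenz equations with a hand-written fourth-order Runge-Kutta step rather than SciPy, follows a single trajectory, and does no plotting. The names `lorenz_rhs`, `rk4_step`, and `integrate_lorenz` are invented for this example and are not the API of `solve_lorenz`.

```python
def lorenz_rhs(state, sigma=10.0, beta=8.0 / 3, rho=28.0):
    """Right-hand side of the Lorenz system for state = (x, y, z)."""
    x, y, z = state
    return (sigma * (y - x), rho * x - y - x * z, -beta * z + x * y)

def rk4_step(f, state, dt):
    """Advance `state` by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate_lorenz(start, max_time=4.0, dt=0.01, **params):
    """Return the list of states visited from `start` up to max_time."""
    steps = int(round(max_time / dt))
    f = lambda s: lorenz_rhs(s, **params)
    trajectory = [tuple(start)]
    for _ in range(steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory

path = integrate_lorenz((1.0, 1.0, 1.0), max_time=4.0,
                        sigma=10.0, beta=8.0 / 3, rho=28.0)
print(len(path), path[-1])
```

A fixed step size keeps the example short; an adaptive solver is the better choice for long integrations of a chaotic system.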
Let's use interact to explore this function:
In [40]:
interact(solve_lorenz, angle=(0.,360.), N=(0,50), sigma=(0.0,50.0),
         rho=(0.0,50.0), beta=fixed(8./3));
In [41]:
%load_ext load_style
In [ ]:
%load_style talk.css